Abstract

Many problems in the theory of sparse approximation require bounds on operator norms of a random submatrix
drawn from a fixed matrix. The purpose of this note is to collect estimates for several different norms that are
most important in the analysis of $\ell_1$ minimization algorithms. Several of these bounds have not appeared in detail.

Résumé

On the norm of a randomly drawn submatrix. Many problems in nonlinear approximation theory require
bounding the norm of a matrix randomly extracted from a fixed matrix of larger
dimensions. The aim of this note is to present some estimates of these norms that turn out to be
important for the study of $\ell_1$-type minimization algorithms. Several of these bounds have not yet been
published explicitly.


1. Introduction

We consider matrices written with respect to the standard basis, and we focus on three specific norms.
The norm $\|\cdot\|$ is the usual Hilbert space operator norm; the $\ell_1$ to $\ell_2$ operator norm $\|\cdot\|_{1\to 2}$ computes the
maximum $\ell_2$ norm of a column; and $\|\cdot\|_{\max}$ returns the maximum absolute entry of a matrix. Throughout,
$\{\delta_j\}$ is a sequence of independent 0-1 random variables with common mean $\delta$. We write $R$ for the square
diagonal matrix whose $j$th diagonal entry is $\delta_j$; the dimensions of $R$ are determined by context. The
symbol $\mathbb{E}_p$ indicates the $L_p$ norm of a random variable, i.e., $\mathbb{E}_p X = (\mathbb{E}\,|X|^p)^{1/p}$.
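
The three norms, the random restriction $R$, and the $L_p$ norm of a random variable are all easy to compute numerically. The following is a minimal NumPy sketch for experimentation; it is not part of the original note, and the function names (`spectral_norm`, `one_to_two_norm`, `max_norm`, `random_restriction`, `lp_norm`) are illustrative choices.

```python
import numpy as np

def spectral_norm(M):
    """Hilbert space operator norm: the largest singular value of M."""
    return float(np.linalg.norm(M, 2))

def one_to_two_norm(M):
    """l1 -> l2 operator norm: the largest l2 norm of a column of M."""
    return float(np.linalg.norm(M, axis=0).max())

def max_norm(M):
    """Largest absolute entry of M."""
    return float(np.abs(M).max())

def random_restriction(n, delta, rng):
    """Diagonal 0-1 matrix R whose entries are independent with mean delta."""
    return np.diag((rng.random(n) < delta).astype(float))

def lp_norm(samples, p):
    """Empirical L_p norm (E|X|^p)^(1/p) computed from samples of X."""
    samples = np.abs(np.asarray(samples, dtype=float))
    return float(np.mean(samples ** p) ** (1.0 / p))

# Usage: estimate E_p ||R A R|| for a random symmetric A by Monte Carlo.
rng = np.random.default_rng(0)
G = rng.standard_normal((8, 8))
A = (G + G.T) / 2
draws = []
for _ in range(200):
    R = random_restriction(8, 0.5, rng)
    draws.append(spectral_norm(R @ A @ R))
print(lp_norm(draws, p=2))
```

The Monte Carlo estimate is only an approximation of $\mathbb{E}_p \|RAR\|$; for small $n$ the expectation can instead be computed exactly by enumerating all $2^n$ restriction patterns.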

The main theorem is a bound on the spectral norm of a random principal submatrix.

Theorem 1.1 (Random principal submatrices) Let $A$ be an $n \times n$ Hermitian matrix, decomposed
into diagonal and off-diagonal parts: $A = D + H$. Fix $p$ in $[2, \infty)$, and set $q = \max\{p, 2\log n\}$. Then

$$
\mathbb{E}_p \|RAR\| \le C \left[ q\, \mathbb{E}_p \|RHR\|_{\max} + \sqrt{\delta q}\; \mathbb{E}_p \|HR\|_{1\to 2} + \delta \|H\| \right] + \mathbb{E}_p \|RDR\| .
$$


A partial case of this theorem appears in [5]. The argument is based on [4] and classical ideas from [3].
We apply the result to sparse approximation in Section 5. From this moment bound, tail probabilities
can be estimated by applying Markov's inequality in the usual fashion.


2. Preliminaries

We begin with some background. First, we present a decoupling result for the spectral norm that refines
a classical proposition from harmonic analysis [1].

Proposition 2.1 (Decoupling) Let $H$ be an Hermitian matrix with a zero diagonal. Then

$$
\mathbb{E}_p \|RHR\| \le 2\, \mathbb{E}_p \|RHR'\| ,
$$

where the two random restrictions on the right-hand side are independent and identically distributed.

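Proof. We establish the result for $p = 1$. Let $H_{jk}$ be the matrix with entry $h_{jk}$ in position $(j, k)$ and
zero elsewhere. Let $\{\eta_j\}$ be iid 0-1 random variables with mean 1/2. By Jensen's inequality,

$$
\begin{aligned}
\mathbb{E}\, \|RHR\|
&= 2\, \mathbb{E}_\delta \Big\| \sum_{j<k} \mathbb{E}_\eta \big[ \eta_j (1 - \eta_k) + \eta_k (1 - \eta_j) \big]\, \delta_j \delta_k (H_{jk} + H_{kj}) \Big\| \\
&\le 2\, \mathbb{E}_\eta \mathbb{E}_\delta \Big\| \sum_{j<k} \big[ \eta_j (1 - \eta_k) + \eta_k (1 - \eta_j) \big]\, \delta_j \delta_k (H_{jk} + H_{kj}) \Big\| .
\end{aligned}
$$

There is a 0-1 vector $\eta^*$ for which the expression exceeds its expectation over $\eta$. Let $T = \{j : \eta^*_j = 1\}$. Then

$$
\mathbb{E}\, \|RHR\|
\le 2\, \mathbb{E} \Big\| \sum_{j \in T,\, k \in T^c} \delta_j \delta_k (H_{jk} + H_{kj}) \Big\|
= 2\, \mathbb{E} \Big\| \sum_{j \in T,\, k \in T^c} \delta_j \delta_k H_{jk} \Big\|
= 2\, \mathbb{E} \Big\| \sum_{j \in T,\, k \in T^c} \delta_j \delta'_k H_{jk} \Big\| ,
$$

where $\{\delta'_j\}$ is an independent copy of the sequence $\{\delta_j\}$. The first equality follows from a standard identity
for block counter-diagonal Hermitian matrices. Now, the norm of a submatrix does not exceed the norm
of the matrix, so we re-introduce the missing entries to complete the argument:

$$
\mathbb{E}\, \|RHR\| \le 2\, \mathbb{E} \Big\| \sum_{j \ne k} \delta_j \delta'_k H_{jk} \Big\| = 2\, \mathbb{E}\, \|RHR'\| .
$$

□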
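
For small matrices the decoupling inequality can be verified exactly, since both expectations are finite sums over 0-1 patterns. The sketch below is an illustration added here, not part of the original argument; the helper name `restriction_expectations` and the random test matrix are assumptions made for the example. It enumerates all patterns for the case $p = 1$.

```python
import itertools
import numpy as np

def restriction_expectations(H, delta):
    """Return (E||RHR||, E||RHR'||) by exact enumeration of all 0-1 patterns."""
    n = H.shape[0]
    patterns = [np.array(bits, dtype=float)
                for bits in itertools.product([0, 1], repeat=n)]
    # Probability of each pattern under independent selectors with mean delta.
    probs = [delta ** s.sum() * (1.0 - delta) ** (n - s.sum()) for s in patterns]

    def norm_restricted(rows, cols):
        # Entry (j, k) of diag(rows) H diag(cols) is rows[j] * h_jk * cols[k].
        return np.linalg.norm(H * np.outer(rows, cols), 2)

    coupled = sum(p * norm_restricted(s, s) for s, p in zip(patterns, probs))
    decoupled = sum(p * q * norm_restricted(s, t)
                    for s, p in zip(patterns, probs)
                    for t, q in zip(patterns, probs))
    return float(coupled), float(decoupled)

# Usage: a random real symmetric matrix with zero diagonal.
rng = np.random.default_rng(0)
G = rng.standard_normal((4, 4))
H = G + G.T
np.fill_diagonal(H, 0.0)
coupled, decoupled = restriction_expectations(H, delta=0.5)
print(coupled, 2 * decoupled)  # Proposition 2.1 predicts coupled <= 2 * decoupled
```

The enumeration costs $4^n$ norm evaluations for the decoupled term, so it is only practical for very small $n$.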


We also need a novel re-coupling result. It is based on the same ideas, so we omit the proof.

Proposition 2.2 (Re-coupling) Let $H$ be an Hermitian matrix with a zero diagonal. Then

$$
\mathbb{E}_p \|RHR'\|_{\max} \le 4\, \mathbb{E}_p \|RHR\|_{\max} .
$$

Third, we bound the expected maximum of a random subset of nonnegative scalars. See [4, Lemma 5.1]
for related ideas.

Proposition 2.3 (Max of a random subset) Let $a_1, a_2, \dots, a_n$ be nonnegative and $K = \lfloor \delta^{-1} \rfloor$. Then

$$
\mathbb{E} \max_j \delta_j a_j \le \frac{2\delta}{1 - \delta} \max_{|T| \le \delta^{-1}} \sum_{j \in T} a_j .
$$

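Proof. We may take $\{a_j\}$ nonincreasing. The bound follows from a calculation and the fact $K > \delta^{-1} - 1$:

$$
\mathbb{E} \max_j \delta_j a_j
\le \mathbb{E} \sum_{j=1}^{K} \delta_j a_j + a_{K+1}
\le \delta \sum_{j=1}^{K} a_j + \frac{1}{K} \sum_{j=1}^{K} a_j
\le \frac{2}{K} \sum_{j=1}^{K} a_j .
$$

□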
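
Both sides of Proposition 2.3 can be evaluated in closed form, which gives a quick sanity check. The sketch below is an added illustration with hypothetical function names. It uses the observation that, once the $a_j$ are sorted in nonincreasing order, the maximum equals $a_j$ exactly when $j$ is the first selected index, which happens with probability $\delta (1 - \delta)^{j-1}$. It assumes $0 < \delta < 1$.

```python
import numpy as np

def expected_max_random_subset(a, delta):
    """Exact value of E max_j delta_j * a_j for nonnegative a and 0 < delta < 1."""
    a = np.sort(np.asarray(a, dtype=float))[::-1]  # nonincreasing order
    # P(first selected index is j) = delta * (1 - delta)^(j - 1), j = 1, ..., n.
    weights = delta * (1.0 - delta) ** np.arange(a.size)
    return float(np.dot(weights, a))

def prop_2_3_bound(a, delta):
    """Right-hand side: (2 delta / (1 - delta)) * (sum of the K largest a_j)."""
    a = np.sort(np.asarray(a, dtype=float))[::-1]
    K = int(np.floor(1.0 / delta))
    return 2.0 * delta / (1.0 - delta) * float(a[:K].sum())

# Usage
a = [3.0, 1.0, 4.0, 1.0, 5.0]
print(expected_max_random_subset(a, 0.25), prop_2_3_bound(a, 0.25))
```

The maximum over sets $|T| \le \delta^{-1}$ is attained by the $K$ largest entries because the $a_j$ are nonnegative, which is why the bound reduces to a partial sum of the sorted sequence.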
3. Maximum column norm of a random submatrix

This section contains bounds on the maximum column norm of a matrix restricted to a random set of
columns or a random set of rows. The first result is an easy application of Proposition 2.3.

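Theorem 3.1 Let $B$ be an $m \times n$ matrix with columns $b_1, \dots, b_n$. When $p \ge 1$,

$$
\mathbb{E}_p \|BR\|_{1\to 2} \le \left[ \frac{2\delta}{1 - \delta} \max_{|T| \le \delta^{-1}} \sum_{j \in T} \|b_j\|_2^p \right]^{1/p} .
$$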


The second result is for random row restrictions. A partial case appears in [5, Prop. 13].

Theorem 3.2 Let $B$ be an $m \times n$ matrix. For $p$ in $[2, \infty)$, set $q = \max\{p, 2\log n\}$. Then

$$
\mathbb{E}_p \|RB\|_{1\to 2} \le 2^{1.25} \sqrt{q}\; \mathbb{E}_p \|RB\|_{\max} + \sqrt{\delta}\, \|B\|_{1\to 2} .
$$


The proof relies on a lemma that is established with Khintchine's inequality.

Lemma 3.3 Let $X$ be an $m \times n$ matrix. For $r \in [1, \infty)$, choose $q \ge \max\{r, 2\log n\}$. Then

$$
\mathbb{E}_r \max_{k = 1, 2, \dots, n} \Big| \sum_j \varepsilon_j |x_{jk}|^2 \Big| \le 2^{0.25} \sqrt{q}\, \|X\|_{\max} \|X\|_{1\to 2} ,
$$

where $\{\varepsilon_j\}$ is a sequence of independent Rademacher variables.

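Proof. First, we replace the maximum with the $\ell_q$ norm. Apply the inequalities of Jensen and Khintchine.
Bound the sum over $k$ by a maximum. Finally, apply Hölder's inequality:

$$
\begin{aligned}
\mathbb{E}_r \max_k \Big| \sum_j \varepsilon_j |x_{jk}|^2 \Big|
&\le \left[ \mathbb{E} \Big( \sum_k \Big| \sum_j \varepsilon_j |x_{jk}|^2 \Big|^q \Big)^{r/q} \right]^{1/r} \\
&\le \left[ \sum_k \mathbb{E} \Big| \sum_j \varepsilon_j |x_{jk}|^2 \Big|^q \right]^{1/q} \\
&\le C_q \left[ \sum_k \Big( \sum_j |x_{jk}|^4 \Big)^{q/2} \right]^{1/q} \\
&\le C_q\, n^{1/q} \max_k \Big( \sum_j |x_{jk}|^4 \Big)^{1/2} \\
&\le C_q\, e^{0.5} \max_{j,k} |x_{jk}| \; \max_k \Big( \sum_j |x_{jk}|^2 \Big)^{1/2} .
\end{aligned}
$$

The factor $e^{0.5}$ arises because $q \ge 2\log n$ implies $n^{1/q} \le e^{0.5}$. Finally, recall that the constant $C_q$
from Khintchine's inequality is bounded by $2^{0.25} e^{-0.5} \sqrt{q}$. □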

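Proof. (Theorem 3.2) Define $E = \mathbb{E}_p \|RB\|_{1\to 2}$. Writing $r = p/2$, we elaborate the quantity $E$. Then
we center the random variables and apply the usual symmetrization [3, Lem. 6.3]:

$$
E^2 = \left[ \mathbb{E} \Big( \max_k \sum_j \delta_j |b_{jk}|^2 \Big)^r \right]^{1/r}
\le 2 \left[ \mathbb{E}_\delta \mathbb{E}_\varepsilon \Big( \max_k \Big| \sum_j \varepsilon_j \delta_j |b_{jk}|^2 \Big| \Big)^r \right]^{1/r} + \delta\, \|B\|_{1\to 2}^2 .
$$

Invoke Lemma 3.3 with $X = RB$. Afterward, Cauchy-Schwarz results in

$$
E^2 \le 2^{1.25} \sqrt{q} \left[ \mathbb{E}\, \|RB\|_{\max}^r \|RB\|_{1\to 2}^r \right]^{1/r} + \delta\, \|B\|_{1\to 2}^2
\le 2^{1.25} \sqrt{q}\; \mathbb{E}_p \|RB\|_{\max}\, E + \delta\, \|B\|_{1\to 2}^2 .
$$

Solutions to the relation $E^2 \le \alpha E + \beta$ obey $E \le \alpha + \sqrt{\beta}$. This point completes the proof. □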
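
The final step of the proof is elementary. A short derivation, added here for completeness and assuming $E, \alpha, \beta \ge 0$: the quadratic $E^2 - \alpha E - \beta \le 0$ forces $E$ to lie below the larger root, and the subadditivity of the square root, $\sqrt{x + y} \le \sqrt{x} + \sqrt{y}$, gives

$$
E \le \frac{\alpha + \sqrt{\alpha^2 + 4\beta}}{2} \le \frac{\alpha + \alpha + 2\sqrt{\beta}}{2} = \alpha + \sqrt{\beta} .
$$

With $\alpha = 2^{1.25} \sqrt{q}\; \mathbb{E}_p \|RB\|_{\max}$ and $\beta = \delta\, \|B\|_{1\to 2}^2$, this is the bound stated in Theorem 3.2.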


4. Spectral norms of random submatrices

The proof of Theorem 1.1 uses a result of Rudelson-Vershynin [4] to bound the spectral norm of
a random column submatrix. Its proof is analogous to that of Theorem 3.2 but relies on a sharp
noncommutative Khintchine inequality [2]. The explicit constant was obtained in [5, Prop. 12].

Theorem 4.1 (Rudelson-Vershynin) Let $B$ be an $m \times n$ matrix with rank $r$. For $p$ in $[2, \infty)$, set
$q = \max\{p, 2\log r\}$. Then

$$
\mathbb{E}_p \|BR\| \le 3 \sqrt{q}\; \mathbb{E}_p \|BR\|_{1\to 2} + \sqrt{\delta}\, \|B\| .
$$
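
Proof. (Theorem 1.1) Remove the matrix diagonal, then decouple the projectors with Proposition 2.1:

$$
\mathbb{E}_p \|RAR\| \le 2\, \mathbb{E}_p \|RHR'\| + \mathbb{E}_p \|RDR\| .
$$

To estimate the first term, we apply the Rudelson-Vershynin theorem twice, once for each projector:

$$
\begin{aligned}
\mathbb{E}_p \|RHR'\|
&\le 3 \sqrt{q}\; \mathbb{E}_p \|RHR'\|_{1\to 2} + \sqrt{\delta}\; \mathbb{E}_p \|R'H\| \\
&\le 3 \sqrt{q}\; \mathbb{E}_p \|RHR'\|_{1\to 2} + 3 \sqrt{\delta q}\; \mathbb{E}_p \|HR'\|_{1\to 2} + \delta\, \|H\| .
\end{aligned}
$$

The second line uses $\|R'H\| = \|HR'\|$, which holds because $H$ is Hermitian. The maximum column norm bound, Theorem 3.2, yields

$$
\mathbb{E}_p \|RHR'\| \le 3 \sqrt{q} \left[ 2^{1.25} \sqrt{q}\; \mathbb{E}_p \|RHR'\|_{\max} + \sqrt{\delta}\; \mathbb{E}_p \|HR'\|_{1\to 2} \right] + 3 \sqrt{\delta q}\; \mathbb{E}_p \|HR'\|_{1\to 2} + \delta\, \|H\| .
$$

Since $R'$ and $R$ are identically distributed, we combine the second and third terms to reach

$$
\mathbb{E}_p \|RAR\| \le 15 q\; \mathbb{E}_p \|RHR'\|_{\max} + 12 \sqrt{\delta q}\; \mathbb{E}_p \|HR\|_{1\to 2} + 2 \delta\, \|H\| + \mathbb{E}_p \|RDR\| .
$$

Finally, apply the re-coupling result, Proposition 2.2, to the first term. □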


5. Random subdictionaries

A dictionary is an $m \times n$ matrix $\Phi$ whose columns $\varphi_1, \dots, \varphi_n$ have unit $\ell_2$ norm. Define the hollow Gram matrix
$H = \Phi^* \Phi - I$, and note that $\|H\|_{1\to 2} \le \|\Phi^* \Phi\|_{1\to 2} = \max_k \|\Phi^* \varphi_k\|_2 \le \|\Phi\|$. A random subdictionary
with expected cardinality $\delta n$ is a column submatrix $\Phi_T$ where $T = \{j : \delta_j = 1\}$.

The most important statistic associated with a dictionary is the coherence $\mu = \max_{j \ne k} |\langle \varphi_j, \varphi_k \rangle|$. For
a set $T$ of columns, the local 2-cumulative coherence is the quantity

$$
\mu_2(T) = \max_{k \notin T} \left[ \sum_{j \in T} |\langle \varphi_j, \varphi_k \rangle|^2 \right]^{1/2} .
$$
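
Both statistics can be read off the hollow Gram matrix. The following NumPy sketch is an added illustration, with hypothetical function names and a toy dictionary chosen for the example. It assumes the columns of `Phi` already have unit norm and that the dictionary has at least two columns.

```python
import numpy as np

def hollow_gram(Phi):
    """H = Phi^* Phi - I for a dictionary Phi with unit-norm columns."""
    return Phi.conj().T @ Phi - np.eye(Phi.shape[1])

def coherence(Phi):
    """Largest absolute inner product between two distinct columns."""
    H = hollow_gram(Phi)
    off_diagonal = ~np.eye(H.shape[0], dtype=bool)
    return float(np.abs(H[off_diagonal]).max())

def local_2_cumulative_coherence(Phi, T):
    """max over k outside T of the l2 norm of (<phi_j, phi_k>)_{j in T}."""
    n = Phi.shape[1]
    in_T = np.zeros(n, dtype=bool)
    in_T[np.asarray(list(T), dtype=int)] = True
    if not in_T.any() or in_T.all():
        return 0.0  # convention for an empty index set (an assumption)
    H = hollow_gram(Phi)
    block = H[in_T][:, ~in_T]  # rows in T, columns outside T
    return float(np.linalg.norm(block, axis=0).max())

# Usage: the standard basis of R^2 plus one diagonal unit vector.
s = 1.0 / np.sqrt(2.0)
Phi = np.array([[1.0, 0.0, s],
                [0.0, 1.0, s]])
print(coherence(Phi), local_2_cumulative_coherence(Phi, [0, 1]))
```

The submatrix `block` is the nonzero part of $RH(I - R)$ when $R$ is the 0-1 diagonal matrix that selects $T$, so its maximum column norm agrees with $\|RH(I - R)\|_{1\to 2}$.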

Theorem 3.2 allows us to estimate the local 2-cumulative coherence of a random subdictionary.

Corollary 5.1 Let $T = \{j : \delta_j = 1\}$. When $p = 2\log n$, we have $\mathbb{E}_p\, \mu_2(T) \le 4 \mu \sqrt{\log n} + \sqrt{\delta}\, \|\Phi\|$.

Proof. Observe that the local coherence $\mu_2(T) = \|RH(I - R)\|_{1\to 2} \le \|RH\|_{1\to 2}$. Invoke Theorem 3.2
along with the facts $\|RH\|_{\max} \le \mu$ and $\|H\|_{1\to 2} \le \|\Phi\|$. □

We can use Theorem 1.1 to study the conditioning of a random subdictionary via the quantity $\|RHR\|$.

Corollary 5.2 For $p = 2\log n$, we have the bound

$$
\mathbb{E}_p \|RHR\| \le C \left[ \mu \log n + \sqrt{\delta\, \|\Phi\|^2 \log n} + \delta\, \|\Phi\|^2 \right] . \qquad (1)
$$

Proof. Apply Theorem 1.1 with $A = H$, then introduce $\|HR\|_{1\to 2} \le \|\Phi\|$ and $\|RHR\|_{\max} \le \mu$. □

A subject for further investigation is to use Proposition 2.3 to sharpen the first term of the bracket in
(1) when $\mu$ is small. An elegant answer has remained elusive.